ABSTRACT

A study of data compression techniques involving linear interpolation
and linear prediction showed that redundancy is a problem that can be
significantly reduced by various polynomial approximations. A more
recent compressor, the continuous secant compressor, which determines
the optimum sampling interval prior to sampling, was found to be the most
efficient compressor examined. The continuous secant compressor bases
its reduction technique on a straight-line approximation. Data
compression results when the system in question does not occupy its entire
bandwidth. The addition of white noise over the entire bandwidth was
found to reduce the efficiency of the continuous secant compressor by
only a small amount. In the presence of noise, the straight-line
approximation had a Gaussian probability distribution with a relatively
small standard deviation.
I. INTRODUCTION

A. DATA COMPRESSION 

Millions of dollars are being spent every year storing, processing
and transmitting repetitious video and telemetry data [Ref. 1]. As a
result of research in such fields as space exploration and television, there
is an increasing demand for a more efficient means of handling this
redundant data. The volume of data from telemetry systems has increased
exponentially in the past few years and is expected to increase even
more in the future. Much effort has been devoted to designing
a fairly simple and reliable means of transmitting only the significant
changes in data instead of processing all the generated data. This
research has produced a number of techniques which make up the field
of data compression. For many years this extraction
had been accomplished by human analysts. Today data compression is
being simulated by electronic devices, but the implementation of these
techniques has not been thoroughly established, even though it is gaining
significant recognition as the technology which can: (1) reduce physical
and power requirements for a spacecraft system, (2) increase ground
communication data flow and (3) reduce computer costs and data storage.

A good application of a data compressor would be a
pulse-code-modulated (PCM) telemetry system linking a
spaceborne sensor to the user on the ground [Ref. 2]. The data
compressor could be incorporated into either the spaceborne transmitting
system or the ground receiving system. Diagrams of both possibilities
appear in Fig. 1.1. The advantages of a data compressor in a space vehicle
are as follows: (1) by reducing the number of transmitted samples relative
to the generated samples, the transmitter power or transmitter bandwidth
may be reduced; (2) by maintaining the same transmitter power, more
samples can be transmitted, thus increasing the information rate; and
(3) the operating range can be increased due to the reduced system
bandwidth. By placing the data compressor in the ground system the
following benefits could be achieved (these also apply to the spaceborne
system): (1) data compression could allow for a reduction in electronic
processing equipment; and (2) data compression could allow a given amount
of information to be represented by fewer symbols, which in turn reduces
the time required for processing the data.


B. CLASSIFICATIONS OF DATA COMPRESSORS 

Data compressors may be classified into four basic categories. The
complete classification is shown in Fig. 1.2 [Ref. 3]. Parameter
extraction reduces the bandwidth required to transmit a given sample by
means of an information-describing irreversible transformation [Ref. 4].
Examples of parameter extraction techniques are phase comparators,
spectrum analyzers and peak detectors. A second method is adaptive
sampling, the technique that synchronizes the sampling rate with
the rate of data activity. The third method, encoding, transforms a
message into coded words. Examples of encoding techniques include
delta modulation and probability descriptions. The last category,
redundancy reduction, is the primary subject of this thesis. Redundancy
reduction eliminates redundant samples by comparing previous or
succeeding samples with arbitrary reference patterns.


C. REDUNDANCY REDUCTION 

Shannon defines redundancy as "that fraction of a message or datum
which is unnecessary and hence repetitive in the sense that if it were
missing the message would still be essentially complete or at least could
be completed."* There are two basic methods employed under the heading
of redundancy reduction: prediction, which determines redundancy by
comparing predicted samples to actual samples; and interpolation, which
determines the redundancy of past samples prior to transmission.


*Kortman, C. M., "Data Compression by Redundancy Reduction,"
IEEE Spectrum, v. 4, March 1967
II. SYNCHRONOUS SAMPLING
A. INTRODUCTION

Data compression techniques can be divided into two classes: those
which destroy the time reference and transmit the significant samples at a
constant rate, and those which transmit only significant samples as they
occur in time. The first of these methods is termed synchronous sampling.
Because the interval between significant samples is not constant,
additional information is needed to identify the time base. This additional
information must be applied to digital words and therefore cannot be
implemented on analog input signals.

Synchronous compressors have been simulated chiefly for television
bandwidth compression studies. Many of the techniques used in
prediction and interpolation have been simulated for this purpose
[Refs. 5, 6, 7, 8]. Interpolation and prediction techniques
have been applied to video information with the first-order interpolator
giving the best results, yielding compression ratios on the order of two
to four [Ref. 7]. The compression ratio is defined as:

\[
\mathrm{CR} = \frac{\text{Total number of samples}}{\text{Number of significant samples}} \tag{1-1}
\]


B. CHANGE TRANSMISSION 

If the transmitter and the receiver use identical prediction techniques,
only changes from the predicted values need be transmitted in order to
reconstruct the data set, rather than the values themselves.
A high sampling rate would give small changes in the signal, thus reducing
the amount of information to be transmitted; but for an input signal with
significant changes between samples, the compressor would have little
or no effect on the amount of data to be transmitted.
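
The idea can be illustrated with a minimal Python sketch. It assumes the simplest possible shared predictor (each sample is predicted to equal the previous one); the function names `encode_changes` and `decode_changes` are hypothetical and do not come from the references.

```python
def encode_changes(samples):
    """Return the change of each sample from its predicted value.

    Assumption: transmitter and receiver both predict that the next
    sample equals the previous one, starting from zero.
    """
    changes = []
    predicted = 0
    for x in samples:
        changes.append(x - predicted)
        predicted = x  # the receiver holds the same value after decoding
    return changes


def decode_changes(changes):
    """Rebuild the samples by adding each change to the running prediction."""
    samples = []
    predicted = 0
    for c in changes:
        predicted = predicted + c
        samples.append(predicted)
    return samples
```

For a slowly varying input such as `[10, 11, 11, 12]` the transmitted changes after the first are small numbers, whereas an input with large jumps between samples produces changes as large as the samples themselves, which is the limitation noted above.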


C. VARIABLE SAMPLING RATE 

A significant improvement in system efficiency can be obtained by
sampling at higher rates during high data activity and at lower rates
during relatively inactive periods. In order to employ this type of
compressor in a system, the periods of high and low data activity must
be known a priori, or a timing mechanism must be incorporated into the
compressor which would act as a switch for varying the sampling rate.

A sporadic input signal could put serious restrictions on the system.
The efficiency of such a system would largely depend upon the hardware
employed in the switching circuit. A constantly changing signal would
require a constant high sampling rate in order to reproduce the signal with
any degree of accuracy. Data compression results from not transmitting
during relatively inactive periods of signal activity; therefore, if the
signal contains no inactive periods every sample will be transmitted and
data expansion could result.


D. DELTA MODULATION

A continuous signal can be reconstructed by transmitting a series of
pulses of equal width and magnitude that differ only in sign [Ref. 9]. At
each succeeding sample point the change in the input signal is
approximated by adding one of these pulses to, or subtracting it from, the
previous data point. Data compression results from transmitting only the
series of pulses or changes in the signal. The basic restriction in delta
modulation is similar to the problem that occurs in change transmission:
a high sampling rate is needed to reconstruct high-frequency signals,
which may result in data expansion.
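
A short Python sketch of delta modulation follows. It is only an illustration under stated assumptions: the reconstruction starts at zero, the pulse magnitude `step` is fixed, and the names `delta_modulate` and `delta_demodulate` are hypothetical.

```python
def delta_modulate(samples, step):
    """Encode samples as +1/-1 pulses of fixed magnitude `step`.

    Assumption: the running estimate starts at zero on both ends.
    """
    pulses = []
    estimate = 0.0
    for x in samples:
        pulse = 1 if x >= estimate else -1
        estimate += pulse * step   # track the input one step at a time
        pulses.append(pulse)
    return pulses


def delta_demodulate(pulses, step):
    """Rebuild the staircase approximation from the pulse train."""
    estimate = 0.0
    output = []
    for p in pulses:
        estimate += p * step
        output.append(estimate)
    return output
```

When the input changes by more than one `step` between samples, the staircase cannot keep up, which is why a high sampling rate is needed for high-frequency signals.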
III. PREDICTORS
A. INTRODUCTION 

The prediction technique requires that one have knowledge of past 
samples. This a priori knowledge can be obtained from either previous 
experiments or from recent samples generated by the experiment in 
question. The history of past samples used to predict a time-varying 
function is limited only by the hardware employed in the predictor and the 
predictor efficiency. Evidence has shown that a complex system need 
not be the most efficient. Studies have shown that polynomial methods 
are extremely efficient and fairly easy to implement. The block diagrams 
of a simple predictor used as a data compressor are shown in Fig. 3.1
and Fig. 3.2 [Ref. ?].

The predictor estimates the value of the next sample from
previous data samples, using a predefined method. If the difference
between the predicted sample and the actual sample is within certain
error limitations placed on the system, then the sample is not transmitted
because it is redundant. Prediction techniques can be linear or
nonlinear.

Linear prediction techniques assert that the predicted sample will
lie on an nth-order polynomial. This concept can be described by the
following equation [Ref. 2]:

\[
\hat{X}_t = X_{t-1} + \Delta X_{t-1} + \Delta^2 X_{t-1} + \cdots + \Delta^n X_{t-1} \tag{3-1}
\]

where

\[
\begin{aligned}
\hat{X}_t &= \text{predicted sample value at time } t \\
X_{t-1} &= \text{previous data sample at time } t-1 \\
\Delta X_{t-1} &= X_{t-1} - X_{t-2} \\
\Delta^2 X_{t-1} &= \Delta X_{t-1} - \Delta X_{t-2} \\
&= X_{t-1} - 2X_{t-2} + X_{t-3}
\end{aligned}
\]
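
As a hedged illustration of equation (3-1), the Python sketch below sums the backward differences of the most recent samples to form an nth-order prediction. The helper names `backward_difference` and `predict_next` are hypothetical, and the sketch ignores the question of which samples were actually transmitted.

```python
def backward_difference(history, order):
    """Return the order-th backward difference taken at the latest sample."""
    values = list(history)
    for _ in range(order):
        # Each pass replaces the list by its first differences.
        values = [b - a for a, b in zip(values, values[1:])]
    return values[-1]


def predict_next(history, n):
    """Predict the next sample as the sum of differences of order 0..n."""
    if len(history) < n + 1:
        raise ValueError("need at least n + 1 past samples")
    return sum(backward_difference(history, k) for k in range(n + 1))
```

For the samples 1, 4, 9, 16 the zero-order prediction repeats 16, the first-order prediction adds the last change (7), and the second-order prediction also adds the change in the change (2), which recovers the next square exactly because the data lie on a second-order polynomial.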
B. ZERO-ORDER PREDICTORS

The simplest predictor is one in which a constant value is to be
monitored. A good example of a constant value which could be incorporated
into a data compression system is the temperature of a space capsule,
which need not be constantly monitored if an efficient data compressor
is used. The temperature would be transmitted only if the change was
significant. For the zero-order predictor n = 0 and equation (3-1) reduces
to

\[
\hat{X}_t = X_{t-1} \tag{3-2}
\]

which means that the predicted sample should be identical to the previous
sample. To implement a constant-value predictor, all that is needed is a
delay element and a comparator. If each sample were compared only with
its immediate predecessor, a value that increased or decreased slowly
would never exceed the tolerance, the accumulated error would go
undetected, and the system could become unstable. To avoid this
accumulation of error from the redundant samples, $X_{t-1}$ refers only to
the last transmitted sample.

The predictor which predicts that each sample will be the same as
the previous transmitted sample is called a zero-order predictor
[Ref. 2, 3, 4, 9, 10, 11]. The fixed-aperture zero-order predictor
compares each sample to the limits of the predefined error. If the sample
lies outside these limits then it is transmitted and the next sample is
compared to a new set of limits. The new error limits are the original
limits shifted so that the new upper limit is the original lower limit or the
new lower limit is the original upper limit. The direction of the shift
depends on whether the sample exceeded the original lower bound or the
original upper bound. The use of the new sample as the basis for
succeeding samples is termed floating-aperture prediction. It is similar to
the fixed-aperture zero-order predictor except that the tolerance limits
are placed about each significant sample as it occurs. Both zero-order
predictors predict that each sample will be the same as the preceding
sample, or in other words, they try to fit the data with a horizontal
straight line. The floating-aperture and the fixed-aperture predictors
are shown in Fig. 3.3 and 3.4 respectively.
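
The floating-aperture case can be sketched in a few lines of Python. This is an illustrative sketch rather than the simulated flow diagram of Appendix A; it assumes a symmetric tolerance about the last transmitted sample, and the names `zero_order_floating` and `compression_ratio` are hypothetical.

```python
def zero_order_floating(samples, tolerance):
    """Return the indices of the samples a floating-aperture ZOP transmits."""
    if not samples:
        return []
    sent = [0]                    # the first sample is always transmitted
    reference = samples[0]
    for i in range(1, len(samples)):
        if abs(samples[i] - reference) > tolerance:
            sent.append(i)
            reference = samples[i]   # aperture is re-centred on this sample
    return sent


def compression_ratio(total, significant):
    """Compression ratio as total samples over significant samples."""
    return total / significant
```

With the samples 0.0, 0.1, 0.2, 0.6, 0.7, 1.5 and a tolerance of 0.5, only the samples at positions 0, 3 and 5 are transmitted, giving a compression ratio of two.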

There is one other zero-order predictor which is a version of the 
floating-aperture predictor where a priori knowledge of an established 
trend is used to make the data compressor more efficient. The difference 
between this zero-order predictor and the previous floating-aperture zero- 
order predictor is that each sample is offset by a predetermined amount, 
the sign of the offset being determined by the sign of the most recent 
deviation of the predicted sample from the actual sample. This zero-order
offset predictor is shown in Fig. 3.5 [Ref. 2].
C. FIRST-ORDER PREDICTOR

The last predictor to be examined is the first-order predictor [Ref. 2,
9, 13]. As the name implies, the first-order predictor predicts that each
succeeding sample will be the same as the previous sample plus the
same change that occurred between the two preceding samples. For the
first-order predictor equation (3-1) becomes

\[
\begin{aligned}
\hat{X}_t &= X_{t-1} + \Delta X_{t-1} \\
&= X_{t-1} + (X_{t-1} - X_{t-2}) \\
&= 2X_{t-1} - X_{t-2}
\end{aligned}
\]

Since $\Delta X_{t-1}$ represents the change that occurred between the two
previous samples, the predicted sample becomes the previous sample plus
that same change. As in the zero-order predictor, in order to avoid errors
from the redundant samples, $X_{t-1}$, $X_{t-2}$ or both must be
transmitted samples rather than redundant samples. In graphic terms the
first-order predictor would be implemented by drawing a straight line
between two samples and then predicting that the next sample will fall on
an extension of this straight line within specified error limitations. The
first-order predictor is shown in Fig. 3.6. If the actual sample falls
outside the tolerance limits then it is transmitted and the next sample is
predicted to lie on a straight line connecting the previous two samples.
As long as the samples remain redundant, or within the specified tolerance
limits, the straight line is extended indefinitely until a sample that
falls outside the tolerance limits occurs.
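
A minimal Python sketch of this behaviour is given below. It makes one assumption not spelled out in the text: when a sample is redundant, the value used for later predictions is the predicted value on the line (which the receiver can reproduce), so the line is simply extended. The name `first_order_predictor` is hypothetical.

```python
def first_order_predictor(samples, tolerance):
    """Return (sent_indices, reconstruction) for a first-order predictor.

    Assumption: redundant samples are replaced by their predicted values,
    so transmitter and receiver extend the same straight line.
    """
    n = len(samples)
    if n <= 2:
        return list(range(n)), list(samples)
    sent = [0, 1]                       # two samples are needed to start a line
    recon = [samples[0], samples[1]]
    for i in range(2, n):
        predicted = 2 * recon[i - 1] - recon[i - 2]
        if abs(samples[i] - predicted) > tolerance:
            sent.append(i)              # significant sample: transmit it
            recon.append(samples[i])
        else:
            recon.append(predicted)     # redundant: extend the line
    return sent, recon
```

For a ramp that levels off, the samples on the ramp are all redundant after the first two, and only the first sample that leaves the extended line is transmitted.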
D. CONCLUSIONS

It follows that the second-order predictor would go back one more
sample and make allowance for the change in the change from the previous
data samples. The system becomes very complex as n increases, and for
simplicity only the zero-order and the first-order predictors
were examined.

Although the zero-order predictor is the only data compressor in
actual operation, there are many shortcomings of predictors in general.
The basic problem is that for high data activity almost every sample is
transmitted, which could result in data expansion, and for low data
activity relatively few samples are transmitted. This means that the
transmitter must be able to accept a very fast sample rate at a time t,
and at a time t + Δt the transmitter must be able to work efficiently at a
very slow sample rate. Obviously this will require a fairly complex
system. Another problem is that in order for a synchronous compressor
to be efficient the signal must be sampled at a very high rate (e.g., for
an error of one-tenth of one percent, the function must be sampled at over
6,000 times the highest-frequency component in the signal [Ref. 9]).
Prediction also tends to amplify noise so that during high or low data
activity the prescribed error tolerance will not be constant. Although
many authors give methods for obtaining the optimum linear predictor, they
all agree that in general the power spectrum of the compressed signal is
indeterminate before compression can be applied and, therefore, by
assuming a power spectrum a suboptimum predictor will result.
IV. INTERPOLATORS
A. INTRODUCTION 

Prediction techniques are educated guesses based only on the 
assumption that the data will remain relatively constant from one time 
interval to the next. If the sampled data is randomly distributed or 
corrupted with noise, then the redundancy reduction efficiency of the 
predictor will be relatively low for acceptable accuracies. A higher 
compression ratio can be obtained by relying on future samples as well 
as past samples. This process of determining redundancy after the sample 
has been examined is termed interpolation.

Interpolation requires that a time-delay element be present in the
system, which might be difficult to implement in certain compressor
systems. Interpolation uses present samples to determine where past
samples should have been and compares this estimate to the actual position
of the past sample. In some interpolators, the length of time delay
needed is directly related to the number of redundant samples.

One advantage of interpolation over prediction is the increased
signal-to-noise ratio which may be achieved by interpolators. Most
predictors require that the transmitted sample be used as the basis for
the next prediction. If this transmitted sample is contaminated with
noise then the predicted value is not accurate. Interpolators use the
succeeding samples to determine redundancy; thus, the transmitted
sample that was contaminated with noise has little effect on the
approximation. Predictors amplify noise by predicting that the next
sample will include added noise. As a result interpolation techniques are
not as susceptible to system noise as prediction techniques.

The compression ratio is higher for interpolators than for predictors.
Once again this is due to the fact that the sample is not transmitted until
at least one succeeding sample is examined. This ensures greater
accuracy, which results in more redundant samples, thus increasing the
compression ratio. These advantages make interpolation the most desirable
form of data compression, but the delay problem has impeded its progress.


B. ZERO-ORDER INTERPOLATOR

The first interpolator to be examined is the zero-order interpolator,
which is very similar to the zero-order predictor [Ref. 10]. As in the
zero-order predictor the redundant set of samples is represented by a
horizontal straight line. The transmitted sample is not the actual sample
but is the average of the most positive sample and the most negative
sample in the redundant set. The transmitted sample can be expressed by
the following:

\[
Y_t = \frac{Y_u + Y_l}{2}
\]

where

\[
\begin{aligned}
Y_t &= \text{transmitted sample} \\
Y_u &= \text{largest sample value in the redundant set} \\
Y_l &= \text{smallest sample value in the redundant set}
\end{aligned}
\]

The zero-order interpolator compares each sample to the preceding
samples in the redundant set, determining the largest and smallest
samples in the set. When the point is reached where $Y_u - Y_l > E$, the
predefined error, the sample $Y_t$ is transmitted. The spread that can be
tolerated in the zero-order interpolator is strictly dependent upon the
predefined error. The zero-order interpolator is shown in Fig. 4.1.
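
The run-building logic can be sketched in Python as follows. This is a sketch under stated assumptions: the permitted spread between the largest and smallest sample of a run is passed in as `max_spread`, the final run is flushed at the end of the data, and the name `zero_order_interpolator` is hypothetical.

```python
def zero_order_interpolator(samples, max_spread):
    """Return (run_length, transmitted_value) pairs.

    Each run is represented by the average of its largest and smallest
    sample; a run ends when one more sample would exceed `max_spread`.
    """
    runs = []
    if not samples:
        return runs
    hi = lo = samples[0]
    length = 1
    for x in samples[1:]:
        new_hi, new_lo = max(hi, x), min(lo, x)
        if new_hi - new_lo > max_spread:
            runs.append((length, (hi + lo) / 2))   # close the current run
            hi = lo = x                            # start a new run at x
            length = 1
        else:
            hi, lo, length = new_hi, new_lo, length + 1
    runs.append((length, (hi + lo) / 2))           # flush the final run
    return runs
```

Because the transmitted value is the midpoint of the run, no sample in the run differs from it by more than half of `max_spread`.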


C. FIRST-ORDER INTERPOLATOR

The other linear interpolator is the first-order interpolator [Ref. 9,
10]. Higher-order interpolators are very complex and give a relatively
small increase in the compression ratio [Ref. 9]. The first-order
interpolator represents the redundant sample set by a straight line. There
are two ways of representing a data set by a first-order interpolator. The
first of these is the four-degree-of-freedom interpolator in which there is
freedom of both the starting and ending points. The starting and ending
points are computed so that the straight line will approximate as many
samples as possible within the error tolerance limit. This process is also
termed the Chebyshev approximation. The four-degree-of-freedom first-order
interpolator is shown in Fig. 4.2a. The computed straight lines can
be connected by extending either the starting or ending points of
successive straight lines. The two-degree-of-freedom first-order
interpolator, shown in Fig. 4.2b, is the same as the four-degree-of-freedom
first-order interpolator except that one end of the straight line is
anchored whereas the other end is free to move, so that the line can
represent the maximum number of samples. The predefined error is the
limiting factor in both first-order interpolators. As shown in the figures,
the four-degree-of-freedom first-order interpolator can cover a wider range
of signal than the two-degree-of-freedom first-order interpolator.

The two-degree-of-freedom first-order interpolator is less complex
than the four-degree-of-freedom interpolator, but since there is no
justification in weighting some samples more heavily than others, the
four-degree-of-freedom interpolator would give better redundancy reduction.
One problem with these interpolators is that they require an enormous
amount of computation time, which could put serious restrictions on the
computers which might be employed in the system.


D. FAN METHOD OF INTERPOLATION 

Gardenhire proposed a method of interpolation which reduces the
number of computations required for the above interpolators, but in
essence produces the same results [Ref. 9]. This technique, termed the
fan method of interpolation, operates similarly to the two-degree-of-freedom
first-order interpolator. The calculations involve computing two
slopes, both originating from the last transmitted sample, to the nth
sample plus a specified tolerance and to the nth sample minus a specified
tolerance. A line is then drawn between the last transmitted sample and
sample n+1. If the slope of this line lies between the first two slopes
then sample n is redundant. If it is redundant then two new slopes are
calculated from the last transmitted sample to sample n+2 plus and minus
the tolerance limits. Then the slope to sample n+3 is compared to the new
slopes and the old slopes. If this slope does not lie in the fan, which is
defined as the outer extremities of all the computed slopes, then the
sample n+2 is transmitted and becomes the new starting point for
succeeding samples. If this slope lies in the fan then the sample is
redundant and the fan is extended by computing two new slopes. This
process is repeated until a significant sample occurs. This method is
shown in Fig. 4.3.
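
The following Python sketch shows one common way a fan-style test can be coded; it is an illustration, not a transcription of the simulated flow diagram. It assumes that the fan limits are updated at every sample and kept as the tightest pair of slopes seen so far, that the sample just before the first out-of-fan sample is the one transmitted, and that the final sample is always sent to close the last line. The name `fan_interpolator` is hypothetical.

```python
def fan_interpolator(samples, tolerance):
    """Return the indices of transmitted samples using a fan-style test."""
    n = len(samples)
    if n <= 2:
        return list(range(n))
    sent = [0]
    start = 0                                   # last transmitted sample
    upper, lower = float("inf"), float("-inf")  # current fan limits
    for i in range(1, n):
        slope = (samples[i] - samples[start]) / (i - start)
        if not (lower <= slope <= upper):
            # Sample i leaves the fan: the previous sample ends the line.
            start = i - 1
            sent.append(start)
            upper, lower = float("inf"), float("-inf")
        dt = i - start
        # Narrow the fan with the slopes to sample i plus/minus tolerance.
        upper = min(upper, (samples[i] + tolerance - samples[start]) / dt)
        lower = max(lower, (samples[i] - tolerance - samples[start]) / dt)
    if sent[-1] != n - 1:
        sent.append(n - 1)                      # close the final line
    return sent
```

Only two slope computations and two comparisons are needed per sample, which is the computational saving the method is meant to provide.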


E. CONCLUSIONS 

Since interpolators determine redundancy from samples that have
already been received, they have a distinct advantage over predictors,
which base their calculations on predictions of future samples.
Interpolation requires that the data compressor system incorporate a time
delay to ensure that samples are not transmitted immediately but delayed
until succeeding samples can be examined. Predictors tend to amplify noise
as each prediction is made, whereas interpolators would reduce this
effect and could tolerate a much lower signal-to-noise ratio. Predictors
use only past transmitted samples as a basis for future predictions. If
such a sample contains noise, then this noise will be predicted to occur in
the next sample. This pattern could continue at each succeeding sample,
giving a net effect of noise amplification. This means that interpolators
would be able to operate efficiently in a much noisier environment than
predictors. The problems of increased complexity and time delay have
impeded the progress of implementing an interpolator.

The data compressors were simulated on the SDS 9300 digital
computer. The four-degree-of-freedom interpolator was not examined
because of its complexity, but the two-degree-of-freedom interpolator was
simulated. (Complete digital flow diagrams of all the data compressors
examined are contained in Appendix A.) Various input signals were
simulated, including a sine wave, an exponential and a random test input
signal which is described in Chapter VI. The results of the test were based
on the compression ratio versus the predefined error.

FIGURE 4.3 FAN METHOD OF INTERPOLATION

The results for the sine-wave simulation are shown in Fig. 4.4.
The figure shows that for the predictors examined, the zero-order
floating-aperture predictor has a lower compression ratio than the
first-order predictor. This is not always the case. A simulation by Medlin
on telemetry data showed that the zero-order floating-aperture predictor
had a higher compression ratio than did the first-order predictor
[Ref. 2]. This is one of the main reasons why the zero-order
floating-aperture predictor has been the only implementation to date. The
other predictors perform as expected, with the zero-order fixed-aperture
predictor having the lowest compression ratio. The same figure shows that
all three interpolation techniques had a higher compression ratio than did
any predictor. The basic reasons for this were explained earlier in this
chapter. The performance of the fan method of interpolation was almost
identical to that of the two-degree-of-freedom first-order interpolator.

The other two input signals gave similar results. The main difference
was a shift in the curves. The curves shifted upward for the exponential
input and shifted downward for the random input. The differences in
information rate caused the observed shifts. Compression results when
the input signal does not occupy the entire prescribed bandwidth. The
random input signal does not allow for as much compression as does the
exponential input because it occupies a greater percentage of its
prescribed bandwidth.

FIGURE 4.4 A COMPARISON OF DATA COMPRESSORS

The simulation showed that various types of data could represent a
redundancy problem. Data compression seems to be a very effective
means of reducing this redundancy problem.
V. CONTINUOUS SECANT METHOD

A. INTRODUCTION 

Another technique for reducing redundant samples is to perform 
compression on the analog signal prior to sampling. This would ease the 
load of high data rates on digital computer computation and the amount 
of digital memory. This technique has been shown to be more efficient 
and it gives higher compression ratios than other methods previously 
described. Another advantage of reducing redundancy prior to sampling
is that the sampling rate will be reduced, thus easing the load on the
sampling mechanism.


B. FORMULATION OF THE CONTINUOUS SECANT METHOD

The last data compressor to be examined is the continuous secant
method of compression, which will be presented in much more detail than
the previous data compressors [Ref. 9, 12]. Most signals of interest
can be broken up into a series of concave or convex signals. The only 
assumption to be made in order to employ the continuous secant method 
is that the signal be continuous and have a fixed direction of concavity 
which changes a finite number of times in the continuous interval. This 
compressor, unlike the previous compressors that were simulated, acts 
on the continuous signal prior to sampling by determining the maximum
interval between samples in which the function does not exceed a pre- 
defined tolerance level. This concave or convex interval is approximated 


by a secant such that the secant does not deviate from the continuous
signal by more than the predefined maximum absolute error. Three slopes
are needed in order to determine the optimum sampling interval. For the
strictly concave case, $M_1(t)$ is the slope of the line connecting the
continuous signal F(t) and the point A, where A represents the beginning
of the interval. The continuous secant method of interpolation is shown in
Fig. 5.1. In equation form,

\[
M_1(t) = \frac{F(t) - (F(t_0) + E)}{t - t_0} \tag{5-1}
\]

where

\[
\begin{aligned}
t_0 &= \text{beginning of the interval} \\
t &= \text{maximum sampling interval} \\
E &= \text{predefined error}
\end{aligned}
\]

The second slope needed for the concave case is the slope of the line
connecting the signal with point B as seen in Fig. 5.1; or in equation
form,

\[
M_2(t) = \frac{F(t) - F(t_0)}{t - t_0} \tag{5-2}
\]

From Fig. 5.1 it is obvious that $M_2(t)$ is the maximum value of the
slope from point B to the signal, F(t). This slope is equal to the slope
of the tangent at the point T [Ref. 9]. Equating slopes,
$M_1(T) = \max M_1(t)$ for $t > t_0$. The limiting equation then becomes

\[
\max M_1(t) < M_2(t), \qquad t > t_0 \tag{5-3}
\]

For the convex case, the third slope is needed and is represented by
the following:

\[
M_3(t) = \frac{F(t) - (F(t_0) - E)}{t - t_0} \tag{5-4}
\]

This slope now represents the minimum value of the function over
the interval in question and is thus termed Min $M_3(t)$. As described
in the reference cited for the continuous secant method, the slope of the
tangent at the end of the interval is equal to $M_3(t)$, or

\[
\min M_3(t) > M_2(t), \qquad t > t_0
\]

The maximum interval between samples is thus determined by the
following relationships, depending on the direction of concavity:

\[
\begin{aligned}
\max M_1(t) &< M_2(t) \\
\min M_3(t) &> M_2(t)
\end{aligned}
\qquad t > t_0
\]

All four possibilities of continuous intervals appear in Fig. 5.2.
With modern hybrid computers the continuous secant method can be
simulated as shown in Ref. 12.

FIGURE 5.1 CONTINUOUS SECANT METHOD OF INTERPOLATION

FIGURE 5.2 THE FOUR POSSIBLE RELATIONS FOR THE CONTINUOUS
SECANT COMPRESSOR
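
The goal of the method, finding the longest interval over which a secant stays within the predefined error of the signal, can be checked numerically. The Python sketch below is a brute-force stand-in and not the slope-comparison scheme described above: it steps the end of the interval forward on a fixed grid and stops as soon as the secant drawn from the signal value at the start deviates from the signal by more than the error. It assumes a fixed direction of concavity on the interval; the name `max_secant_interval` and the `steps` parameter are hypothetical.

```python
import math


def max_secant_interval(f, t0, error, t_end, steps=500):
    """Return the largest grid time t in (t0, t_end] whose secant from
    (t0, f(t0)) to (t, f(t)) stays within `error` of f on [t0, t].

    Assumption: f has a fixed direction of concavity on [t0, t_end], so
    the deviation only grows as the interval is lengthened.
    """
    dt = (t_end - t0) / steps
    f0 = f(t0)
    best = t0
    for k in range(1, steps + 1):
        t = t0 + k * dt
        slope = (f(t) - f0) / (t - t0)
        # Largest gap between the signal and the secant on the grid points.
        deviation = max(
            abs(f(t0 + j * dt) - (f0 + slope * j * dt)) for j in range(k + 1)
        )
        if deviation > error:
            break
        best = t
    return best
```

For example, `max_secant_interval(math.sin, 0.0, 0.01, math.pi / 2)` gives the first sampling interval on the concave first quarter of a unit sine wave; a larger error yields a longer interval, and a straight-line signal can be covered by a single secant.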
VI. NOISE IN DATA COMPRESSION SYSTEMS
A. INTRODUCTION 

Up to this point compression techniques have been discussed
without the presence of noise. Noise is often neglected in the study of
data compression for two reasons: first, the presence of noise tends to
reduce the compression ratio; and second, noise has a very pronounced
effect on systems which use derivatives in determining redundancy.

A data compression system has a defined prescribed bandwidth which
is dependent on the highest-frequency component the input signal is 
expected to contain. High-frequency noise beyond the sampling rate and 
outside of this finite bandwidth can be filtered out extremely well using 
conventional filtering techniques. Compression results when the system 
does not occupy its entire predefined bandwidth. Since one cannot filter 
out the noise in the prescribed bandwidth there is a problem of deter- 
mining the effects of this noise on data compressors. 

As was stated in Chapter III, predictors tend to amplify noise and as
a result they would not give an accurate representation of the effects of
noise on compression. In all of the interpolators discussed prior to the
continuous secant method, the predefined error originated at the last
transmitted sample. In the continuous secant compressor, the predefined
error is placed at the point of maximum deviation of the signal from the
predicted pattern. This is the main reason why the continuous secant
method of compression is the most efficient of all compressors examined.
Reference 7 contains a complete analysis comparing the noise level
to the number of significant samples required to reproduce the input
signal to within a specified tolerance. It shows that if the RMS noise
level is equal to the predefined error in the continuous secant
compressor, there is an increase of only 20% in the sampling rate. The
slope of the line $M_2(t)$ determines when the next sample will occur. A
good measure of the effects of the noise on the compressor would be the
probability distribution of the slope of the line $M_2(t)$, since it is the
limiting factor in determining the optimum sampling interval.


B. CHARACTERISTICS OF THE INITIAL TEST SIGNALS 

A simple sine wave was chosen for the input signal in the initial 
test. The sine wave has a continuously varying slope which should give 
good results for piecewise examination. The frequency of the sine wave 
is one cycle per second and the magnitude is unity. 

The characteristics of the noise which was added to the sine wave
are shown in Fig. 6.1. The RMS noise is the root-mean-square value of
the noise as compared to the input signal. The bandwidth of the noise
spectrum is twenty cycles per second, centered about the origin.


C. RESULTS OF THE INITIAL TEST

The distribution of the slope was first examined for a purely concave
interval represented by the first quarter of the sine wave. Various
predefined errors were used, coupled with different levels of noise. The
effects of the noise on the slope can be seen in Fig. 6.2a.
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FIGURE 6.2a THE SLOPE ERROR DISTRIBUTION OF THE FIRST QUARTER 
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At the beginning of the sine wave the slope rises very fast and as 
the sine wave reaches ninety degrees the slope has decreased to zero. 
The results of Fig. 6.2a show that the steep slope at the beginning of 
the sine wave had a pronounced effect on the distribution of the slope.
Although for very small errors the distribution of the slope centers on the 
actual value of the slope with little variation observed, for significant 
errors the slope distribution tends to move toward a much higher value. 
This is due to the variation of the initial steep slope. As the noise is 
increased slightly, the average slope moves towards the direction of 
concavity until a point is reached where the distribution of the slope is 
completely random. This means that the input signal is completely 
saturated with noise and cannot be reconstructed with any degree of 
accuracy. 

The sine wave was then shifted by one quarter of a cycle. Although
the two segments are mirror images of each other, the results were not
identical. The difference in the distribution is due to the initial
slope, which, as was shown above, has a pronounced effect on the
distribution. As shown in Fig. 6.2b, as the noise is increased the
distribution shifts slightly to a more negative value. This is due to
the increase in the negative slope. As the predefined error is
increased, more of the signal is included in the continuous interval, and
the effects of the steep negative slope take hold.

The next logical step was to examine the inflection point. The
results were that the distribution was nearly gaussian if the portion of
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FIGURE 6.2b THE SLOPE ERROR DISTRIBUTION FOR THE SECOND 
QUARTER OF THE SINE WAVE 
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the signal is centered so that the inflection point occurs in the middle 

of the test signal. Of course this is not true for every inflection point
on other input signals, but since the sine wave provides symmetry about
the inflection point, the distribution is gaussian until the system is
completely saturated by noise. When the inflection point is displaced off
center, the distribution is still approximately gaussian, but adding
noise offsets the slope in the direction of the maximum deviation. (See
Fig.
D. EXAMINATION OF THE SLOPE IN A COMPLETELY RANDOM ENVIRONMENT

1. Characteristics of the Input Signals

In order to examine the probability distribution of the slope in
a completely random environment, a random test signal was chosen. A
signal composed of the sum of eight sine waves with no overlapping
harmonics has the following characteristics: [Ref. 9]

a. a flat spectral density 


b. a probability distribution which is identical with a gaussian 
distribution 


The frequencies of the new test signal are as follows:

    F(t) = \sum_{i=1}^{8} \sin(W_i t)

    W_1 = .1122 radians/sec
    W_2 = .1995 radians/sec
    W_3 = .2860 radians/sec
    W_4 = .3940 radians/sec
    W_5 = .4910 radians/sec
    W_6 = .6100 radians/sec
    W_7 = .7320 radians/sec

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The noise has the same characteristics as used in the previous 
examples. Since the noise bandwidth is still twenty cycles per second, 
it covers the entire range of the carrier frequencies and has ample width 
to cover significant harmonics. In order to approximate a continuous 
signal on the SDS 9300 digital computer a sampling rate of five hundred 
cycles per second was chosen. A Fourier analysis of the input signal
showed that the signal had a flat spectral density. The amplitude of 
the input signal is shown in Fig. 6.3. 
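
As a rough illustration of how such a test signal could be generated
today, the following Python sketch sums unit-amplitude sine waves at the
frequencies listed above and samples the result at five hundred samples
per second. The function names, the fourteen-second default duration, and
the use of Python are assumptions made for illustration; the original
work was carried out on the SDS 9300, and only the frequencies given in
the list above are included.

```python
import math

# Angular frequencies in radians/sec, taken from the list in the text.
FREQUENCIES = [0.1122, 0.1995, 0.2860, 0.3940, 0.4910, 0.6100, 0.7320]
SAMPLE_RATE = 500   # samples per second, as stated in the text
DURATION = 14.0     # seconds; assumed default for this sketch


def sum_of_sines(t, frequencies=FREQUENCIES):
    """Sum of unit-amplitude sine waves evaluated at time t (seconds)."""
    return sum(math.sin(w * t) for w in frequencies)


def sample_signal(duration=DURATION, rate=SAMPLE_RATE):
    """Return (times, values) sampled uniformly from 0 to duration inclusive."""
    n = int(round(duration * rate)) + 1
    times = [k / rate for k in range(n)]
    values = [sum_of_sines(t) for t in times]
    return times, values


times, values = sample_signal()
```

The magnitude of the sampled signal can never exceed the number of sine
waves summed, which gives a simple check on the generated values.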

To insure a random starting point another gaussian distribution
was used. As was seen in the initial test, the starting point can have a
pronounced effect on the distribution of the slope. The starting point
was drawn from a gaussian distribution with a standard deviation equal
to five seconds. The positive half of the gaussian distribution was used
in order to insure a positive starting point. The input signal was
examined from zero to fourteen seconds. At each random point the slope
was computed for the signal without noise and for the signal plus noise.
The actual slope was compared to the slope of the signal plus noise and
the resultant error was determined.
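
The selection of starting points described above might be sketched in
Python as follows. The sketch assumes that points falling beyond the
fourteen-second span are simply redrawn, which the text does not state;
the function name and the seed parameter are likewise illustrative.

```python
import random


def draw_start_times(count, sigma=5.0, t_max=14.0, seed=None):
    """Draw starting times from the positive half of a zero-mean gaussian.

    sigma is the standard deviation in seconds (five seconds in the text).
    Assumption: draws beyond t_max are rejected and redrawn.
    """
    rng = random.Random(seed)
    starts = []
    while len(starts) < count:
        t0 = abs(rng.gauss(0.0, sigma))  # fold onto the positive half
        if t0 <= t_max:
            starts.append(t0)
    return starts


start_times = draw_start_times(100, seed=1)
```

At each of these starting times the slope would then be computed twice,
once for the signal alone and once for the signal plus noise.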

2. Results of the Random Signal Test 

The results of the error distribution show that the slope error
had a fairly good gaussian distribution. The one exception was that for
small predefined errors the probability distribution seemed to shift
slightly to a smaller slope for high noise, and for high predefined
errors the distribution seemed to shift slightly to the higher slopes.
There is no justification for this result except for the following:

a. Only a relatively small number of samples was examined


b. The random generator might have been biased for certain 
values 


One can conclude that the overall distribution of the slope
error is generally gaussian in form. (See Figs. 6.4a-6.4e for a detailed
analysis of each individual run.)

The following set of graphs (Figs. 6.5a-6.5e) presents a slightly
different interpretation of the slope error distribution. The horizontal
axis represents the percentage of RMS noise as compared to the predefined
error, and the vertical axis represents the probability that the slope
will fall between the specified error limitations labeled on each curve.
As is shown in the graphs, when the predefined error is increased the
slope of each of the curves drops significantly as compared to the same
curve with a lower predefined error. As an example, consider a
probability of 60% that the slope error will be less than ± .0001, and
that the system has a predefined error tolerance of .01; from Fig. 6.5b
it is found that the ratio of the RMS noise to the predefined error can
be as high as 85%, or the RMS noise can be .0085. For the same
probability with a predefined error of .1, the RMS noise can now be .06,
which corresponds to a tenfold increase in the predefined error. These
results indicate that by increasing the predefined error and keeping the
noise level constant, the probability of the slope being within certain
limits increases and a more efficient data compressor results. There is
another factor that must be considered to fully comprehend the meaning of
the last statement. Increasing the predefined error causes the following:


a. The time between samples will increase (provided a fairly
random input signal is used); and


b. The reconstruction of the original signal will not be as 
accurate as the system may require. 


Although the samples may be fairly accurate, the information that was
considered redundant in this system might actually be of value. Suppose
that this data compressor was operating in fairly high noise, and that
the system was monitoring a heartbeat. The predefined error is high due
to the high noise level, but the system still picks up the heartbeat. The
only problem is that all the heart murmurs and small echoes would go
undetected. As another example, consider the temperature in a spacecraft
which might be oscillating at a very low amplitude below the predefined
error. Although the system would give better results for a higher value
of noise, the presence of this oscillation might go undetected, and it
could be damaging to the spacecraft. Therefore, there is a trade-off, or
an optimum point where the predefined error will be operating: low
enough to be sensitive to significant changes in the system, and high
enough to keep the noise from interfering significantly with system
efficiency.
E. PERCENTAGE OF SLOPE ERROR DISTRIBUTION

In the past examples, the slope error had no relation to the true slope.
In order to clarify the distribution of the slope as compared to the
actual value of the slope, and not to the slope error, a second set of
tests was applied. These tests differed in that the slope error was
divided by the actual slope to determine the percentage error.
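
Written out, and using m for the slope of the signal alone and m' for
the slope of the signal plus noise (symbols introduced here only for
illustration), the quantity examined in this second set of tests is
presumably

    \text{percentage slope error} = 100 \times \frac{m' - m}{m}

This ratio is undefined where the true slope m is zero, so it can only
be evaluated at points where the signal is not momentarily flat.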

The results of these tests gave much more accurate data than the
previous tests. These graphs (Figs. 6.6a to 6.6f) do not have any
significant shift in the distribution; they are fairly gaussian. The only
reason for minor variations from a true gaussian pattern is that only one
hundred samples were chosen at random, instead of the large numbers
required to give a good gaussian pattern. Since these graphs are fairly
similar in form, a correlation was discovered between each set of curves.
(The results that were used in determining the curves are shown in
Table I.) The results show that there is actually only one set of
gaussian curves. By doubling the predefined error and increasing the RMS
noise-to-predefined-error ratio (Err/E) by a factor of two, the
probability distributions were almost identical. This means that by
doubling the predefined error, the noise can be increased by a factor of
four and the same gaussian distribution will occur. This agrees with the
previous test result, that the greater the predefined error, the noisier
the environment can be. The reason for this is fairly obvious. As was
stated earlier in the chapter, the higher the predefined error the
greater the time between samples. The more signal to be examined, the
higher the probability that the noise will exceed the predefined error in
a time interval t.
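
The factor of four can be checked directly. Writing r for the ratio of
RMS noise to predefined error and N_rms for the RMS noise (symbols
assumed here for illustration), the noise level is N_rms = rE, so that
doubling both r and E gives

    N_{rms}' = (2r)(2E) = 4rE = 4 N_{rms}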


F. CONCLUSION

The preceding results show that for the continuous secant compressor,
the factor that affects the efficiency the most is the predefined
error and not the RMS noise. The compressor depends twice as much on
the predefined error as it does on the noise. There is an optimum point
where every compressor should operate, depending on the specifications
of the system. Finally, the conclusion is made that the distribution of
the slope of M2(t) in the presence of noise is approximately gaussian.

In summary, Figures 6.7 and 6.8 show the final probability
distribution as a function of slope error and predefined error.
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FIGURE 6.7 PROBABILITY DISTRIBUTION OF SLOPE PLUS NOISE TO 
WITHIN 1% OF TRUE SLOPE 
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FIGURE 6.8 PROBABILITY DISTRIBUTION OF SIGNAL AND NOISE 
FOR A SLOPE ERROR OF .0001 
VII. CONCLUSIONS AND RECOMMENDATIONS

A. CONCLUSIONS

Interpolation techniques give a better compression ratio than
prediction techniques, but the increased complexity of interpolators
makes them difficult to implement. The continuous secant compressor,
which determines the optimum sampling interval before sampling, proved to
be very efficient even in the presence of noise. A straight-line
approximation is the basis for the continuous secant compressor. In the
presence of white gaussian noise, the slope of the straight line had a
fairly gaussian distribution. The distribution of the slope was dependent
upon the slope of the input signal at each sample. A completely random
input signal composed of the sum of eight sine waves gave a good gaussian
distribution. Although the white noise affected the slope distribution,
the probability distribution depended largely on the predefined error.


B. RECOMMENDATIONS 

The question that now remains is what is the probability that the
signal plus noise will exceed the predefined error within a specified time
interval. The graphic representation of this statement appears in Fig.
7.1. Although the analytical results were not formulated, the
experimental results showed that the probability distribution was
gaussian for a completely random input. Since E remains constant from run
to run, the
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FIGURE 7.1 PROBABILITY OF EXCEEDING THE PREDEFINED 
ERROR IN A SPECIFIED TIME INTERVAL T 
probability distribution of the slope can be directly related to the
probability distribution of the time axis. This probability distribution
of t-t0, the time when the signal plus noise exceeds the predefined error
E, becomes a good measure of slope distribution. Assuming that the input
signal contains white noise, the equation for the slope becomes


    M_2'(t) = \frac{[F(t) + N(t)] - [F(t_0) + N(t_0)]}{t - t_0}      (7-1)

            = \underbrace{\frac{F(t) - F(t_0)}{t - t_0}}_{M_2(t)}
              + \underbrace{\frac{N(t) - N(t_0)}{t - t_0}}_{\text{ERROR}}

where

    M_2'(t) = slope of M_2(t) with noise added to the signal

    N(t) = white noise added to the signal

    F(t) = input signal

    t = time when the signal plus noise exceeds the
        predefined error, optimum sampling interval

    t_0 = beginning of the time interval


As stated earlier in this thesis, as the predefined error is increased
the optimum sampling interval is increased. Equation 7-1 shows that by
increasing the time interval, the slope error is reduced. This agrees
with the experimental results obtained in Chapter VI. The input signal
also affects the sampling interval. A random input signal should have a
shorter sampling interval than a fairly constant input signal. Equation
7-1 also shows that the slope error distribution is dependent upon the
input noise. The noise at the beginning of the interval is subtracted from
the noise at the end of the interval. If N(t) = N(t0) then the slope error
would be zero, but the magnitude of each sample would be in error by the
amount N(t). If the noise is sufficiently small then the resultant slope
error should be sufficiently small.
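
The dependence of the noise term of Equation 7-1 on the length of the
interval can be illustrated numerically. The following Python sketch is
only an illustration under stated assumptions: the noise values at the
two ends of the interval are taken to be independent gaussian samples
with the same RMS level, and the interval length is held fixed, whereas
in the compressor it is determined by the signal. The function name and
the parameter values are hypothetical.

```python
import random
import statistics


def slope_error_samples(interval, rms_noise, trials=20000, seed=0):
    """Draw samples of [N(t) - N(t0)] / (t - t0), the noise term of Eq. 7-1.

    Assumptions: N(t) and N(t0) are independent zero-mean gaussian values
    with standard deviation rms_noise, and t - t0 equals interval.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        n_start = rng.gauss(0.0, rms_noise)  # N(t0)
        n_end = rng.gauss(0.0, rms_noise)    # N(t)
        errors.append((n_end - n_start) / interval)
    return errors


short_run = slope_error_samples(interval=0.1, rms_noise=0.01)
long_run = slope_error_samples(interval=0.4, rms_noise=0.01)

spread_short = statistics.pstdev(short_run)
spread_long = statistics.pstdev(long_run)
print(spread_short, spread_long)
```

Under these assumptions the spread of the slope error is inversely
proportional to the interval, so lengthening the interval from 0.1 to
0.4 second reduces the standard deviation of the slope error by a factor
of four.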

There is an optimum operating point for each system. The predefined
error should be made as high as possible and still be within the system
error specifications for reconstruction of the input signal, and at the
same time, the RMS noise should be kept to a minimum. A noisy
environment would require a long sampling interval to give a small error.

The noise N(t) has a gaussian distribution, and as a result N(t) -
N(t0) also has a gaussian distribution. In order to prove that the slope
error has a gaussian distribution, t-t0 must have a gaussian
distribution.

It is recommended that studies continue in the field of data
compression, since it is an increasingly valuable tool which has
future potential in the field of communications and space telemetry.
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